Written Chinese
Written Chinese comprises the written symbols used to represent spoken Chinese and the rules about how they're arranged and punctuated. These symbols are commonly known as Chinese characters (traditional/simplified Chinese: 漢字/汉字; pīnyīn: hànzì), many of which have been traced back to the 商 Shāng Dynasty about 1500 BCE. The process of creating characters probably began some centuries earlier.
Written Chinese developed to represent spoken Chinese. At the inception of written Chinese, spoken Chinese was a monosyllabic language; that is, Chinese words represented independent concepts (objects, actions, relations, and so forth) that were generally only one syllable in the spoken language.
The Chinese language has since diversified into many dialects and these dialects have become polysyllabic. As a result, many old syllables no longer stand on their own, in the same way that pre- (a Latin prefix meaning "earlier") can't typically be used on its own as an English word. However, because the meanings of modern Chinese words can usually be analyzed in terms of the old Chinese syllables that constitute them, written Chinese has been continuously used to represent individual Chinese syllables. Each of these syllables represents a morpheme, or semantic unit, so written Chinese is generally (though not universally) considered to be logographic. At least one scholar considers it a large, inefficient phonetic script.
Chinese dialects vary not only by pronunciation, but to a lesser degree also by vocabulary and syntax, so a single written Chinese standard can't represent all dialects equally well. Modern written Chinese, which became the written standard as an indirect result of the May Fourth Movement of 1919, isn't technically bound to any single dialect; however, it most nearly represents the vocabulary and syntax of Mandarin, by far the most widespread Chinese dialect in terms of both geographical area and number of speakers. This version of written Chinese is called Vernacular Chinese, or 白話/白话 báihuà (literally, "white speech").
Before the development of Vernacular Chinese, the prevailing written standard was a vocabulary and syntax rooted in Chinese as spoken around the time of Confucius (about 500 BCE), called Classical Chinese, or 文言文 wényánwén. Over the centuries, Classical Chinese gradually acquired features from various dialects. This accretion was generally slow and minor, so that just before it was supplanted by Vernacular Chinese, Classical Chinese was distinctly different from any contemporary dialect.
Classical Chinese retained much of the vocabulary and syntax of the two-millennia-old version of spoken Chinese it was derived from, so it was taught separately from any native dialect. Once learned, however, it was a common medium for communication between people speaking different dialects, many of which had become mutually unintelligible by the end of the first millennium CE. A Mandarin speaker might say yī, a Cantonese speaker yat, and a Hokkien speaker tsit, but all three will understand the character 一 "one". Despite its ties to the dominant Mandarin dialect, Vernacular Chinese serves the same function to a degree, limited by the fact that Vernacular Chinese expressions are often ungrammatical or unidiomatic in many of the non-Mandarin dialects. This role may not differ substantially from the role of other lingua francas such as Latin: for those trained in written Chinese, it serves as a common medium; for those untrained in it, the graphic nature of the characters is in general no aid to common understanding (characters such as "one" notwithstanding).
The variation in vocabulary among dialects has also led to the informal use of "dialectal characters", as well as standard characters that are nevertheless considered archaic by today's standards. Cantonese is unique among non-Mandarin regional languages in having a written colloquial standard, used in Hong Kong and overseas, with a large number of unofficial characters for words particular to this dialect.
Written colloquial Cantonese has become quite popular in online chat rooms and instant messaging, although for formal written communications Cantonese speakers still normally use standard written Chinese.
Chinese characters in other languages
man'yōgana, which used a small set of Chinese characters to help indicate pronunciation. The man'yōgana later developed into the phonetic syllabaries, hiragana and katakana.
The Chinese characters imported into Japanese were called hànzì, after the 漢/汉 Hàn Dynasty of China; in Japanese, this was pronounced kanji. In modern written Japanese, kanji are used for nouns, verb stems, and adjective stems, while the hiragana are used for prefixes and suffixes. The katakana are used exclusively for sound symbols, and for loans from other languages. The Jōyō Kanji, a list of kanji for common use standardized by the Japanese government, contained 1,945 characters until its 2010 revision raised the total to 2,136; that is roughly half the number of characters commanded by literate Chinese.
The role of Chinese characters in Korean and Vietnamese, in contrast, is much more limited. At one time, many Chinese characters (called hanja, a term cognate to both hànzì and kanji) were introduced into Korean for their meaning, just as in Japanese.
Structure of Chinese characters
Written Chinese is the only major modern writing system not based predominantly on an alphabet or a compact syllabary. Instead, Chinese characters are glyphs whose parts may depict objects or represent abstract notions. These parts may occasionally stand alone as independent characters; more usually, they're combined, using a variety of different principles, to form more complex characters. The best known exposition of Chinese character composition is the 說文解字/说文解字 Shuōwén Jiězì, compiled by 許慎/许慎 Xǔ Shèn around 120 CE. Since Xǔ Shèn didn't have access to Chinese characters in their earliest forms, his analysis, based as it is on somewhat later forms, can't be taken as authoritative. Nonetheless, no later work has supplanted the Shuōwén Jiězì in terms of breadth, so it remains the most accessible source for non-specialists, via its various redactions.
According to the Shuōwén Jiězì, Chinese characters are developed on six basic principles. (These principles, though popularized by the Shuōwén Jiězì, were developed earlier; the oldest known mention of them is in the 周禮/周礼 Zhōulǐ, literally "Rites of Zhou", a text from about 150 BCE.) The first two principles produce simple characters, known as 文 wén:
In fact, some phonetic complexes were originally simple pictographs that were later augmented by the addition of a semantic root. An example is 炷 zhù "candle", which was originally a pictograph 主, a character that's now pronounced zhǔ and means "host". The character 火 huǒ "fire" was added to indicate that the meaning is fire-related.
The last two principles don't produce new written forms; instead, they transfer new meanings to existing forms:
Whenever writers of Chinese encounter a new concept or object, they combine existing characters to name it. For instance, when the Chinese first encountered giraffes, they coined the word cháng jǐng lù (長頸鹿/长颈鹿), meaning "long neck deer", as the name for a giraffe.
Chinese written forms
typeface or font for alphabetic languages. Today, there are five recognized styles of Chinese writing:
- 篆書/篆书 zhuànshū: Seal script, which represents the oldest forms of Chinese characters surviving to modern use. It is used principally for signature seals, or chops, which often take the place of a signature on Chinese documents and artwork.
- 隸書/隶书 lìshū: Clerical script, which was developed during the Western Han dynasty (206 BCE–8 CE). Like seal script, clerical script is in limited use (often in restaurant menus) and has a distinctively antiquated appearance.
- 行書/行书 xíngshū: Running script, a semi-cursive form, in which the character parts begin to run into each other, although the characters themselves generally remain separate. There are many conventions in which some characters deviate from their canonical forms in a consistent manner.
- 草書/草书 cǎoshū: Grass script, a fully cursive form, in which the characters are often entirely unrecognizable by their canonical forms. Grass script gives the impression of anarchy in its appearance, and there's indeed considerable freedom on the part of the calligrapher, but this freedom is circumscribed by conventional "abbreviations" in the forms of the characters.
- 楷書/楷书 kǎishū: Regular script, a non-cursive form, in which each stroke of each character is clearly drawn out from the others. Even though both the running and grass scripts appear to be derived as semi-cursive and cursive variants of regular script, it's in fact the regular script that was the last to develop.
[Figure labels: Seal | Clerical | Running (semi-cursive) | Grass (fully cursive) | Regular (non-cursive)]
Regular script is considered the archetype for Chinese writing, and forms the basis for most printed forms. In addition, regular script imposes a stroke order, which must be followed in order for the characters to be written correctly. (Strictly speaking, this stroke order applies to the clerical, running, and grass scripts as well, but especially in the running and grass scripts, this order is occasionally deviated from.) Thus, for instance, the character 木 mù "wood" must be written starting with the horizontal stroke, drawn from left to right; next, the vertical stroke, from top to bottom, with a small hook toward the upper left at the end; next, the left diagonal stroke, from top to bottom; and lastly the right diagonal stroke, from top to bottom.
Earlier forms
The seal script, although the earliest surviving form of Chinese writing, doesn't represent the embryonic stage of Chinese writing. The first indisputable examples of Chinese writing, dating back to the Shāng Dynasty in the latter half of the second millennium BCE, were the oracle bones (primarily ox scapulae and turtle shells), used for divination. Characters were inscribed on the bones in order to frame a query; the bones were then heated over a fire, and the resulting cracks were interpreted to determine the answer to the query. Such characters are called 甲骨文 jiǎgǔwén "shell-bone script" or oracle bone script.
After the Shāng Dynasty, Chinese writing evolved into the form found on bronzeware made during the Western 周 Zhōu Dynasty (c. 1066–770 BCE) and the Spring and Autumn Period (770–476 BCE), a kind of writing called 金文 jīnwén "metal script". Jīnwén characters are more regular and angular than the embellished forms of the oracle bone script. Later, in the Warring States Period (475–221 BCE), the script became still more regular, and settled on a form, called 六國文字/六国文字 liùguó wénzì "script of the six states", that Xǔ Shèn used as source material in the Shuōwén Jiězì. These characters were later embellished and stylized to yield the seal script characters, which in turn evolved into the other surviving writing styles.
Simplified and traditional Chinese
In the 20th century, written Chinese divided into two canonical forms, called 簡體字/简体字 jiǎntǐzì (simplified Chinese) and 繁體字/繁体字 fántǐzì (traditional Chinese). Simplified Chinese was developed in the People's Republic of China (mainland China) in order to make the characters faster to write (especially as some characters had as many as a few dozen strokes) and easier to memorize. The People's Republic of China has claimed that both goals have been achieved, but some external observers disagree. Little systematic study has been conducted on how simplified Chinese has affected the way Chinese people become literate; the only studies conducted before it was standardized in mainland China seem to have been statistical ones regarding how many strokes were saved on average in samples of running text.
The simplified forms have also been criticized for being inconsistent. For instance, traditional 讓 ràng "allow" is simplified to 让, in which the phonetic on the right side is reduced from 17 strokes to just three. (The speech radical on the left has also been simplified.) However, the same phonetic is used in its full form, even in simplified Chinese, in such characters as 壤 rǎng "soil" and 齉 nàng "snuffle"; these forms remained uncontracted because they were relatively uncommon and would therefore represent a negligible stroke reduction. On the other hand, some simplified forms are simply calligraphic abbreviations of long standing, as for example 万 wàn "ten thousand", for which the traditional Chinese form is 萬.
Simplified Chinese is standard in the People's Republic of China, Singapore, and Malaysia. Traditional Chinese is retained in Hong Kong, the Republic of China (Taiwan), and Macau. Throughout this article, Chinese text is given in both simplified and traditional forms when they differ, with the traditional forms being given first.
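The "traditional/simplified" convention used throughout this article can be sketched in a few lines of Python. This is only an illustration: the `TRAD_TO_SIMP` table and the function names are hypothetical, the table holds just a handful of pairs taken from terms in this article, and a real converter would need a complete mapping and would have to handle cases that a character-by-character lookup can't.

```python
# Hypothetical, hand-entered sample table (traditional -> simplified).
# A real converter needs thousands of entries.
TRAD_TO_SIMP = {"漢": "汉", "話": "话", "簡": "简", "體": "体", "萬": "万", "讓": "让"}


def simplify(text: str) -> str:
    """Replace each traditional character that has an entry in the sample table."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def both_forms(traditional: str) -> str:
    """Render 'traditional/simplified', or a single form when the two are identical."""
    simplified = simplify(traditional)
    if simplified == traditional:
        return traditional
    return f"{traditional}/{simplified}"


print(both_forms("漢字"))    # the two forms differ, so both are shown
print(both_forms("文言文"))  # identical in both forms, so shown once
```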
Layout of written Chinese
Chinese characters conform to a roughly square frame and are not usually linked to one another, so they could be written in any direction in a square grid. Traditionally, Chinese is written in vertical columns from top to bottom; the first column is on the right side of the page, and the text runs toward the left. Text written in Classical Chinese also uses little or no punctuation. In such cases, sentence and phrase breaks are determined by context and rhythm.
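Because each character occupies one cell of a square grid, the traditional layout is easy to model. The following Python sketch (the function name and the fixed column height are assumptions made for illustration) arranges a string into top-to-bottom columns with the first column on the right, padding the last column with full-width spaces.

```python
def vertical_layout(text: str, column_height: int) -> list[str]:
    """Return printable rows for text set in top-to-bottom columns,
    with the first column at the right edge of the page."""
    fullwidth_space = "\u3000"  # keeps the grid aligned when the last column is short
    # Cut the text into columns of column_height characters each.
    columns = [text[i:i + column_height] for i in range(0, len(text), column_height)]
    columns = [col.ljust(column_height, fullwidth_space) for col in columns]
    rows = []
    for r in range(column_height):
        # reversed(): the first column must end up on the right.
        rows.append("".join(col[r] for col in reversed(columns)))
    return rows


for row in vertical_layout("一二三四五六", 3):
    print(row)
# 四一
# 五二
# 六三
```

Reading the printed grid down the rightmost column and then the next column to its left recovers the original order 一二三四五六.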
In modern times, the familiar Western layout of horizontal rows from left to right, read from the top of the page to the bottom, has become more popular, especially in the People's Republic of China, with the rise of Vernacular Chinese; the government of the People's Republic of China mandated left-to-right writing in 1955. Punctuation has also become more prevalent, whether the text is written in columns or rows. The punctuation marks are clearly influenced by their Western counterparts, although some marks are particular to Chinese: for example, the double and single quotation marks (『 』 and 「 」); the hollow period (。), which is otherwise used just like an ordinary full stop; and a special kind of comma called an enumeration comma (、), which is used to separate items in a list, as opposed to clauses in a sentence.
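The division of labor between the two commas can be shown with a small Python sketch. The sample sentence is made up for illustration (roughly, "On the table there are knives, forks, and bowls; on the floor there is a dog"), and the function names are hypothetical; the code simply splits clauses at the full-width comma (,) and list items at the enumeration comma (、).

```python
ENUMERATION_COMMA = "、"  # U+3001, separates items in a list
CLAUSE_COMMA = ","       # U+FF0C, separates clauses in a sentence
HOLLOW_PERIOD = "。"      # U+3002, ends the sentence


def split_clauses(sentence: str) -> list[str]:
    """Split a sentence into clauses, ignoring the final hollow period."""
    body = sentence.rstrip(HOLLOW_PERIOD)
    return [clause for clause in body.split(CLAUSE_COMMA) if clause]


def split_items(clause: str) -> list[str]:
    """Split one clause into the list items it enumerates."""
    return clause.split(ENUMERATION_COMMA)


sentence = "桌上有刀、叉、碗,地上有狗。"
clauses = split_clauses(sentence)
print(clauses)                  # two clauses
print(split_items(clauses[0]))  # three enumerated items in the first clause
```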
Signs are often a particularly challenging aspect of written Chinese layout, since they can be written either left to right or right to left (the latter can be thought of as the traditional layout with each "column" being one character high), as well as from top to bottom. It isn't unusual to encounter all three orientations on signs on neighboring stores. However, in 2004, Taiwan mandated a Western, left-to-right layout of Chinese for most texts (excluding arts and literature).
Literacy
Because the majority of modern Chinese words contain more than one character, there are at least two measuring sticks for Chinese literacy: the number of characters known, and the number of words known.
John DeFrancis, in the introduction to his Advanced Chinese Reader, suggests that a typical Chinese college graduate recognizes perhaps 4,000 to 5,000 characters, and 40,000 to 60,000 words. Jerry Norman, in Chinese, places the number of characters somewhat lower, at 3,000 to 4,000.
These counts are complicated by the tangled development of Chinese characters. In many cases, a single character came to be written in multiple ways, as with English "color/colour". This proliferation of variants was stemmed to an extent during the Qín dynasty, when 李斯 Lǐ Sī promulgated the seal script as the standard throughout the newly unified Chinese empire, but it soon started again. Although the Shuōwén Jiězì lists 10,516 characters (9,353 of them unique, some of which may already have been out of use by the time it was compiled, plus 1,163 graphic variants), the 集韻/集韵 Jíyùn of the Northern 宋 Sòng Dynasty, compiled in 1039, contains no fewer than 53,525 characters, most of them graphic variants.
Chinese dictionaries
lexically ordered, as English dictionaries are, for instance. The need to arrange Chinese characters in order to permit efficient lookup has given rise to a considerable variety of ways to organize and index the characters.
A traditional mechanism is the method of radicals, which uses a set of character roots. These roots, or radicals, generally but imperfectly align with the parts used to compose characters by means of logical aggregation and phonetic complex. A canonical set of 214 radicals was developed during the rule of the 康熙 Kāngxī emperor (around the year 1700); these are sometimes called the Kāngxī radicals. The radicals are ordered first by stroke count (that is, the number of strokes required to write the radical); within a given stroke count, the radicals also have a prescribed order.
Every Chinese character falls under the heading of exactly one of these 214 radicals.
The advantage of this method is that one need not know how to pronounce a character before looking it up; the entry, once located, usually gives the pronunciation. A disadvantage is that it isn't always immediately obvious which of a character's roots is the proper radical. Accordingly, dictionaries often include a list of hard-to-locate characters, indexed by total stroke count, near the beginning of the dictionary.
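As a rough illustration of radical ordering, the Python sketch below sorts a few characters by Kāngxī radical number. It assumes, as is common in radical-indexed dictionaries but not stated above, that entries under one radical are ordered by the number of strokes beyond the radical. The `SAMPLE_ENTRIES` table is hand-entered for this example and the function names are hypothetical; the only library facts relied on are that Unicode encodes the 214 radicals as a contiguous block starting at U+2F00 and that Python's `unicodedata` module knows their names.

```python
import unicodedata

KANGXI_BLOCK_START = 0x2F00  # U+2F00 is KANGXI RADICAL ONE


def radical_glyph(number: int) -> str:
    """Return the Unicode Kangxi-radical symbol for radical number 1..214."""
    if not 1 <= number <= 214:
        raise ValueError("Kangxi radical numbers run from 1 to 214")
    return chr(KANGXI_BLOCK_START + number - 1)


def radical_name(number: int) -> str:
    """Return the Unicode character name of the given radical."""
    return unicodedata.name(radical_glyph(number))


# Hand-entered sample: character -> (radical number, strokes beyond the radical).
SAMPLE_ENTRIES = {
    "森": (75, 8),
    "炷": (86, 5),
    "木": (75, 0),
    "一": (1, 0),
    "林": (75, 4),
    "火": (86, 0),
}


def radical_order(chars):
    """Sort characters by radical number, then by additional stroke count."""
    return sorted(chars, key=lambda ch: SAMPLE_ENTRIES[ch])


print("".join(radical_order(SAMPLE_ENTRIES)))  # 一木林森火炷
print(radical_name(75))                        # KANGXI RADICAL TREE
```

Sorting by radical number alone is enough to respect both orderings described above, because the radical numbers already run from the fewest strokes to the most.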
Other methods of organization exist, often in an attempt to address the shortcomings of the radical method, but are less common. An exhaustive list isn't possible; however, a selection follows:
- By pronunciation: Characters are listed in lexical order by pronunciation, expressed typically in either 漢語拼音/汉语拼音 hànyǔ pīnyīn or 注音符號/注音符号 zhùyīn fúhào. It is common for a dictionary ordered principally by the Kāngxī radicals to have an auxiliary index by pronunciation; this index points to the page in the main dictionary where the desired character can be found.
- The four-corner method: This method uses the fact that most characters fit into a roughly square shape. Characters are indexed according to the kinds of strokes located nearest the four corners (hence the name of the method). One doesn't need to know which part of the character constitutes the radical in order to use this method.
- The 倉頡/仓颉 Cāngjié method: Characters are built up using a set of 24 basic components, which map more or less conveniently to the letters on a keyboard (this method was originally developed to aid computer input). The entire character is used, so as with the four-corner method, one doesn't need to identify the proper radical. However, many of the 24 character components have variant forms, and these must be memorized in order to use Cāngjié effectively.
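Of these methods, ordering by pronunciation is the simplest to sketch in code. The Python example below sorts a few characters from this article by their pīnyīn spelling. It assumes that syllables are ordered alphabetically with the tone mark ignored and that ties are broken by tone number; this is one plausible convention, not a description of any particular dictionary. The function name and sample list are hypothetical.

```python
import unicodedata

# Combining marks used for the four tones once a syllable is decomposed (NFD).
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}


def pinyin_sort_key(syllable: str) -> tuple[str, int]:
    """Return (spelling without tone mark, tone number); unmarked syllables get 5."""
    letters = []
    tone = 5
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)  # base letters (and the diaeresis of ü) are kept
    return "".join(letters), tone


# (pinyin, character) pairs taken from examples in this article.
ENTRIES = [("yī", "一"), ("mù", "木"), ("zhù", "炷"), ("huǒ", "火"),
           ("zhǔ", "主"), ("wàn", "萬"), ("ràng", "讓")]

index = sorted(ENTRIES, key=lambda entry: pinyin_sort_key(entry[0]))
print("".join(char for _, char in index))  # 火木讓萬一主炷
```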
Transliteration and romanization
Transliteration wasn't always considered merely a way to record the sounds of any particular dialect of Chinese; it was once also considered a potential replacement to the millennia-old script. This was first prominently proposed during the May Fourth Movement, and it gained further support with the victory of the Communists in 1949. Immediately afterward, the mainland government began two parallel programs relating to written Chinese. One was the development of an alphabetic script, and the other was the simplification of the traditional characters—a process that would eventually lead to simplified Chinese. The latter wasn't viewed as an impediment to the former; rather, it would ease the transition toward the exclusive use of an alphabetic (or at least phonetic) script.
By 1958, however, priority was given officially to simplified Chinese; a phonetic script, called pīnyīn, had been developed, but its deployment to the exclusion of simplified characters was pushed off to some distant future date. The tight binding between Mandarin and pīnyīn (in full, hànyǔ pīnyīn, to distinguish it from other pīnyīn systems, but abbreviated to pīnyīn hereafter) may have contributed to this deferment. It seems unlikely that pīnyīn will supplant Chinese characters anytime soon as the sole means of representing Chinese.
Pīnyīn uses the Latin alphabet, along with a few diacritical marks, to represent the sounds of Mandarin in standard pronunciation. For the most part, pīnyīn uses vowel and consonant letters as they're used in Romance languages (and also in IPA). However, although 'b' and 'p', for instance, represent the voiced/unvoiced distinction in some languages, such as English, they represent the unaspirated/aspirated distinction in Mandarin; Mandarin has few voiced consonants. All transliterations in this article use the pīnyīn system.
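The unaspirated/aspirated pairing can be written out as a small lookup table. The Python sketch below covers only the three pairs of stop consonants, using their usual IPA values; the table and function names are made up for this illustration.

```python
# pinyin initial -> (usual IPA value, aspirated?)
# All six sounds are voiceless; each pair differs only in aspiration.
STOP_INITIALS = {
    "b": ("p", False),
    "p": ("pʰ", True),
    "d": ("t", False),
    "t": ("tʰ", True),
    "g": ("k", False),
    "k": ("kʰ", True),
}


def describe_initial(letter: str) -> str:
    """Describe how a pinyin stop initial is pronounced."""
    ipa, aspirated = STOP_INITIALS[letter]
    kind = "aspirated" if aspirated else "unaspirated"
    return f"pinyin {letter} = IPA [{ipa}], {kind}"


for letter in STOP_INITIALS:
    print(describe_initial(letter))
```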